Vocabulary-independent recognition of American Spanish phrases and digit strings
Abstract
We describe the development of an R&D recognizer for several Spanish applications, starting from an existing recognition system for American English and modest language-specific resources. The experiments emphasize achieving phonetic accuracy on telephone speech without vocabulary-specific training. We use our basic recognition engine and simple grammar-building tools for predicting word sequences. Only the read sentences from two telephone speech corpora (Voice Across Hispanic America (VAHA) and a smaller TI corpus) are used for training. Word error rates (WER) of 1.9% on telephone service command phrases, 5.5% on telephone numbers, and 12% on continuously spoken sentences are achieved with the newly ported system.
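The abstract reports its results as word error rates. As a minimal sketch of how that metric is conventionally computed (word-level edit distance divided by the number of reference words), the following may help; the function name `word_error_rate` and the sample phrases are illustrative assumptions and do not come from the paper, which does not describe its scoring tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; not the scoring tool used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up digit string with one dropped word out of five.
wer = word_error_rate("cinco cinco cinco uno dos", "cinco cinco uno dos")
assert wer == 0.2
```

Because insertions count as errors, a WER computed this way can exceed 100% when the hypothesis is much longer than the reference.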
Related resources
The CSLU speaker recognition corpus
This paper describes the CSLU Speaker Recognition Corpus data collection. The corpus was motivated by a need for speech data from many speakers, under different environmental conditions, with each speaker providing data over a significant period of time. The corpus was designed to provide sufficient data to study phonetic variability within and across sessions, and to design and evaluate system...
An embedded word training procedure for connected digit recognition
The "conventional" way of obtaining word reference patterns for connected word recognition systems is to use isolated word patterns, and to rely on the dynamics of the matching algorithm to account for the differences in connected speech. Connected word recognition, based on such an approach, tends to become unreliable (high error rates) when the talking rate becomes grossly incommensurate with...
DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances
We investigate how to improve the performance of DNN i-vector based speaker verification for short, text-constrained test utterances, e.g. connected digit strings. A text-constrained verification, due to its smaller, limited vocabulary, can deliver better performance than a text-independent one for a short utterance. We study the problem with “phonetically aware” Deep Neural Net (DNN) in its cap...
Recognition of digit strings in noisy speech with limited resources
Automatic recognition of continuously-spoken digits (e.g., telephone numbers or credit card numbers) is feasible with excellent accuracy, even for speaker-independent applications over telephone lines. However, even such relatively simple recognition tasks suffer decreased performance in adverse conditions, such as significant background noise or fading on portable telephone channels. If an appli...
Investigations on discriminative training criteria
In this work, a framework for efficient discriminative training and modeling is developed and implemented for both small and large vocabulary continuous speech recognition. Special attention will be directed to the comparison and formalization of varying discriminative training criteria and corresponding optimization methods, discriminative acoustic model evaluation and feature extraction. A fo...
Publication date: 1997